Towards a Unified Exploitation of Electronic Dialectal Corpora: Problems and Perspectives

Authors

Nikitas N. Karanikolas

Eleni Galiotou

Angela Ralli

Abstract

In this paper, we deal with the problem of storing and retrieving dialectal data in a unified framework. In particular, we discuss issues concerning the design and implementation of a multimedia database which will contain written and oral data from three Greek dialects in Asia Minor. First, we describe the overall architecture of a system designed to let the user store audio recordings, text transcripts, and other annotations. Then we discuss the possibilities and limitations of a retrieval module intended to combine different linguistic levels for a unified exploitation of oral and written corpora.

Similar resources

Towards An Electronic Analysis of Svan Dialectal Divergences

This is a special internet edition of the article "Towards An Electronic Analysis of Svan Dialectal Divergences" by Jost Gippert (2000). It should not be cited. Citations should be taken from the original edition in Kartveluri memḳvidreoba / Kartvelian Heritage 4, 2000, 134-149.

Comparative Study of the Academic Vocabulary Content of Electronic Engineering Corpora, GE Materials and M.S. Entrance Examinations

The importance of vocabulary learning has been underlined in the field of English for Academic Purposes (EAP) because non-English majors who require reading English texts in their fields of study have to expand their English vocabulary knowledge much more efficiently than ordinary ESL/EFL learners. Since academic vocabulary instruction in Iranian universities is realized through the use of Gene...

Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity

In this study we analyze texts used in Russian Unified State Exam on English language. Texts that formed small research corpora were retrieved from 2 resources: official USE database as a reference point, and popular website used by pupils for USE training “Neznaika” (https://neznaika.pro/). The size of two corpora is balanced: USE has 11934 tokens and “Neznaika” - 11918 tokens. We share Biber’...

Evaluating the Use of Corpus-based Instruction in a Language Teacher Education Context: Perspectives from the Users

A recent practice in the study of language on teacher education programmes has been the use of electronic corpora, and we are therefore still at the initial stages of exploring key issues relating to their integration. Despite arguments for and against their adaptation, there is a dearth of evaluative research examining student teachers’ perceptions of learning and teaching through corpus-based...

A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic

This paper presents a multi-dialect, multi-genre, human annotated corpus of dialectal Arabic with data obtained from both online newspaper commentary and Twitter. Most Arabic corpora are small and focus on Modern Standard Arabic (MSA). There has been recent interest, however, in the construction of dialectal Arabic corpora (Zaidan and Callison-Burch, 2011a; Al-Sabbagh and Girju, 2012). This wor...

Publication date: 2014
